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ABSTRACT 

We present a new general procedure for determining a given set of quantities. To 
this end, we define certain statistic, that we call 'modified x^' (Xm)j because of its 
similarity with the standard x^- The terms of this Xm ^'^^ rnade up of the fluctuations 
of an unbiased estimator of some statistical quantities, and certain weights. Only the 
diagonal terms of the covariance matrix explicitly appear in our statistic, while the 
full covariance matrix (and not its inverse) is implicitly included in the calculation 
of the weights. Choosing these weights we may obtain, through minimising the Xm: 
the estimator that provides the minimum RMS, either for those quantities or for the 
parameters on which these quantities depend. In this paper, we describe our method 
in the context of Cosmic Microwave Background experiments, in order to obtain either 
the statistical properties of the maps, or the cosmological parameters. The test here 
is constructed out of some estimator of the two-point correlation function at different 
angles. For the problem of one parameter estimation, we show that our method has 
the same power as the maximum likelihood method. We have also applied this method 
to Monte Carlo simulations of the COBE-DMR data, as well as to the actual 4-year 
data, obtaining consistent results with previous analyses. We also provide a very good 
analytical approximation to the distribution function of our statistic, which could also 
be useful in other contexts. 
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1 INTRODUCTION 

The study of the Cosmic Microwave Background (CMB) 
anisotropies is providing strong constraints on theories of 
structure formation. These theories are statistical in essence, 
so the extraction of the information must be done in a statis- 
tical way. In particular, the standard method for analysing 
a CMB experiment is the maximum likelihood estimator 
(ML). The procedure is straightforward: maximise the prob- 
ability of the parameters of the model given the data, 
P (parameters! data), over the allowed parameter space. Usu- 
ally, we take the prior probability for the parameters to be 
constant, so this is equivalent to maximising the likelihood, 
P(data| parameters), via the Bayes' theorem. 

The ML method has been widely applied in CMB 
analy ses, for power spectrum or parameters estimation, 
iPav ies ct al. 1987; Gorski 1994; Hinshaw ot al. 1996a). 
When computing the likelihood in these problems, we have 
to deal with the inversion of the covariance matrix of 
the data, which usually involves 0{N^) operations, being 
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A'^ the number of pixels of the map. The increasing size 
of the datasets makes this method computationally cost- 
full for new experiments, so other methods have been in- 
vestigated in the last few years to confront the problem. 
There have been several proposals on this matter. Pioneer- 
ing work on t he proble m of power s pectrum estimation 
iHauser fc Peebles. .19731 : iPeeblesI Il973^ , based on an eval- 
uation of the aim's coefficients of the multipole expansion 
of the observed map in the spherical harmonics basis, have 
been applied to COBE data ( Wright ct al. 1996). Q uadratic 
estimators have been proposed b y several authors llTegmarkI 
Il997t iBond. Jaffe. fc Knoxlll998l) as statistics that give the 
same parameters that maximise the likelihood, but requiring 
less computational work. 

Nevertheless, alternative statistical methods are re- 
quired in the field to extract the cosmological information 
from future CMB experiments (as PLANCK) wh ere the 
numb er of data points will be very large (see, e.g. iBorrilJ 
1^9^ for an estimation of the scaling of the computing 
time with the dataset size). 

Here, we propose a new statistical method to analyse 
a CMB map. In order to illustrate it, we will use the two- 
point correlation function (CF). We first replace the likeli- 
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hood of the fuU map by the hkeUhood of the fluctuations 
of an estimator of the CF. Then, we derive the cosmological 
parameters from it in an efficient manner. If we assume gaus- 
sianity for the primordial CMB fluctuations, the CF com- 
pletely chaxacterises the statistic al properties of the field. In 
this l ine, it has been suggested (iBashinskv fc Bertschineeil 
I2OOII) that it can be used to obtain the power spectrum or for 
parameter estimation, because it encodes all the relevant in- 
formation for that purpose. This approach of considering the 
CF in CMB analyses has been recently used by other authors 
JSzapudi et al.ll2001al : iKashUnskv et alJbOOlt lSzapudi et"ai] 
l2001b^ to estimate the power spectrum. They obtain the 
CF using different estimators, and integrate it, projecting 
over the Legendre polynomials, to obtain the C^'s. The ad- 
vantage of this estimator is that it only needs at the most 
0{N^) operations to be computed, and not 0{N^), as is 
required for the likelihood. 

We construct a modified version of the standard test, 
using the CF evaluated at a certain set of points. The esti- 
mate of the parameters of the model is given by the mini- 
mum of this statistic, as in the standard analysis. We give 
a very good approximation to the distribution function of 
this modified x^y so the confidence limits can be obtained 
without using simulations, by integration below that curve, 
as for the ML. We show that, in several problems, choosing 
a large enough set of points to evaluate the CF, our method 
has the same power as the maximum likelihood, while being 
two different methods. 



2 THE MODIFIED x -TEST 

In this section we will introduce the test, using for this pur- 
pose the two-point CF. Nevertheless, all the procedure de- 
scribed below can be applied to any other estimator. 

For a certain map of the CMB anisotropics, X = 
{a;i, a;jv} with pixels, and errors a — {ai, ajv}, we 
can estimate the CF, C{6), in a set of n angular distances, 
{6k}k=i- In this work we have used the following estimator. 



E[c(ek)] = 



(1) 



i,je{fc} 



where {k} stands for the set of all pixel pair (i, j) such that 
their angular distance is dk, but our proposal and techniques 
can be applied to other estima tors for the CF (see, for e x- 
ample. iKashlinsk v et aP JlooJ), or lSzapudi et alj (l2001bM . 
Hereafter, we will write the estimate of a certain parameter 
as If we have a map with zero mean and no noise, eq. 

Q is a unbiased estimator of the theoretical two-point CF, 
which for a experiment with a symmetric beam is given by 



e=2 



(2) 



where Wi stands for the window function of the experiment. 
We have explicitly removed the dipole contribution in the 
previous equation, because these coefficients are dominated 
by the kinematic dipole in a real map. If the CMB signal is 
gaussian, the power spectrum (or its Fourier transform, the 
CF), encodes all the information about the model. So, under 
this assumption, we can parameterise a model through the 
Ce's themselves, or through the cosmological parameters. 



by writing d — Ce{n, Q, Qt, ^a, Ho, ■■■)■ In general, we will 
write C{6k\M), being M the parameters of the model. 

As an example of the general procedure that we propose 
in this paper, we consider in detail the estimator derived 
from the following statistic: 



2 

Xm 



fc=i 



{E[c{ek)]-c(ek\M)-CN{ek)f 

<T2(C(efc)) 



(3) 



where P^'s are certain weights to be defined below, and 
a^{C{9k)) stands for the variance of the estimator E[C{6k)]- 
CNiOk) represents the discrete CF of the noise. If we have 
an experiment with uncorrelated noise, this function takes 
the form 



Civ(eft) = 



0, 



9fc =0 
9fc /O 



(4) 



This method is a modiflcation of the standard form of 
a x^-test, for the case when the error of each of the es- 
timates entering ||3J are independent and gaussianly dis- 
tributed (hereafter, we mean by standard x^-test the case 
when Pfc = 1, Vfc, and all the terms of the sum in ^ are 
independent). In the present case, the E[C{di)\ quantities 
follow very closely a gaussian, but are correlated. For this 
reason, we have introduced some weights (Pfe), that will be 
determined by minimising the dispersion of the estimator 
derived from equation as we will see in the next section. 
Those quantities will account for the different degree of cor- 
relation between terms, and in a general problem, they will 
be a function Pfe = PkiC'), where C'ij stands for the correla- 
tion matrix between the errors of the estimates of C{9i\M) 
and C{ej\M), i.e. 

c[, = c[,{M) =< I E[c{e,)] - c{e,\M) - Cjv(eo ) x 



X \E[c{e,)] - c{e,\M) - CN{e,)j > (5) 

so the variance a'^{C{6k)) is related to the C'-matrix by 

a\c{e,)) = c:, (6) 

The brackets < ... > represent an average over an ensemble 
of Universes, i.e., an average over realizations for one fixed 
CMB model. 

In principle, our construction seems to miss information 
about the correlations when compared to the usual x^ pro- 
cedure to analyse correlated datasets, because in equation 
^ only the diagonal terms of the covariance matrix (C') 
are explicitly shown. Nevertheless, as we will see in the next 
section, the Pk weights depend on the full covariance matrix 
and not on its inverse, so all the correlations implicitly enter 
in that expression. 

We will not make a detailed comparison between the 
usual x^ method with uses the full covariance matrix C 
(we will refer to this method as the "usual x^") ^ind our 
Xm- However, we will illustrate this with an example (see 
section 6.1). In addition, in Appendix A we present a brief 
comparison of some characteristics of both methods (xi/ 
and the usual x^) for the case of linear problems. It is an 
interesting result that, for gaussian linear problems, if we 
are estimating only one parameter, the estimates from both 
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methods are exactly equal, while being different statistics 
(i.e. they will give different probability contours). Hereafter, 
we will concentrate in our method, and its application to 
CMB problems. 

It is worth to notice that our Xm statistic is a different 
approach to the ML, in the sense that it provides different 
estimates and probability contours. However, in the case of 
CMB analyses, we will see below that it has a similar power 
to the ML, but avoiding the problem of the inversion of the 
covariance matrix. There is however a minor sense in which 
our test may formally be considered as an approximation to 
the ML. It is well-known that the ML is an asymptotically 
efficient estimator for our problem. Thus, in the limit of in- 
finite size of the sample (or for those problems where an effi- 
cient estimator exists, as in linear gaussian problems), then 
the ML is the only one statistic which renders the minimum 
variance, and thus any other estimator may be regarded as 
approximate. 

2.1 Estimate of the method 

Once we have constructed the Xm test, the estimate for 
this method is given by the set of values for the param- 
eters that minimise eq. 0, i.e. the solution to the set of 
equations dx\i{M)/dM = 0. For a given problem, we pro- 
ceed as follows. If we want to estimate a set of p parame- 
ters, M = {Cii we first compute the C' matrix by 
assuming an initial value for those parameters, Mq. Using 
this matrix, we obtain our estimate by solving the following 
system of equations, 

dxli 



for the estimate of ct^j.^ by minimising eq. ^ with respect 



= 0, 



i = 1, 



(7) 



When computing these derivatives with respect to M, we 
neglect the dependence of C' on the parameters, which is 
equivalent to assuming that a^{C{dk)) and Pk are constants 
in the derivation. This process is iterated until convergence. 
The reason to keep the C' matrix fixed in the derivation is 
that we want to have an unbiased estimator of the parame- 
ters. 

The evaluation of the C' matrix can be done by Monte 
Carlo simulations, but we also propose an analytical ap- 
proach. It is possible to evaluate equation using the quan- 
tities < XiXjXkXi >, for a multivariate-gaussian field, as is 
the case for the CMB (see Appendix B). 

In the particular case of power spectrum estimation, the 
set of parameters we have to determine are the Ci's them- 
selves, or the band powers in a certain number of multipole 
bands centred at multipoles {£i, £p} (i.e., M = {C^j^Z^p. 
As the theoretical CF ^ is linear in the Ce's, we have that 
eq. Q is a linear system of p equations, and the solution 
can easily be found. 

As an example, we present here the equations for the de- 
termination of the total power measured by a certain experi- 
ment. In this case, we only have one parameter, M 
defined as 

'rsky = C{6 = 0°) = {AT/T)Ims = ■ 



-CeWi (8) 



which is essentially a normalisation of the spectrum. From 
here, we define the function f{9) = C(9)/ali^y, which is inde- 
pendent of a'^i^y. We can now obtain the analytic expression 



to a 



sky ^ 



which in this case takes the form 



Xm(o-, 



sky J 



{E[c{ek) 



sky 



HCiOk)) 



(9) 



For simplicity, we will not write the term of the noise CF, 
but it can be easily included inside the true CF. Inserting 
the previous expression in equation (|7|l, we obtain, for fixed 
Pk, an equation for alky It must be noted that due to the 
dependence of a^{C{9k)) on a^ky^ this equation is not ex- 
actly linear. We could solve it iteratively, starting with cer- 
tain fixed value for a^{C{9k))- However, for all the values 
of these quantities within the current limits, a first iteration 
is enough, as we will see, so that the equation determining 
u^ky is effectively linear, and therefore its solution is given 
by 



E[eT^ky] 



'iC{9,)) 



Y,p.n 



a^C{9,)) 



(10) 



The RMS of this estimator is given by 



RMS{E[aly]) = ^ 



P .2 f2 



{C{e,)) 



5Z (7^ 



^'iC(9.))J 

fifjPiPjCjj 



1/2 



(11) 



where we have defined fi = f{9i). The expression within 
large parentheses in this equation is the RMS of the second 
sum in equation llOt . The first sum within the parentheses 
correspond to the quadratic addition of the contributions 
of each term in IIUL which is present even when the ran- 
dom variables E[C{9i)] are independently distributed. The 
second sum is due to the correlations between any pair of 
these variables. It should be noted that equation lllll has 
been obtained assuming that the quantities E[C{9i)] follow 
a multivariate gaussian distribution. This is a good approx- 
imation if there are eno ugh pixel pairs entering in the sum 
in Q (see, for example, iHinshaw et al.l lll996bll . for the CF 
of the COBE data) . Similar calculations for the standard \^ 
and the likelihood function can be found in 'Be tancort-Riia 
(1993) (hereafter, B93). The matricial expression of the es- 
timate and the RMS for a general linear problem are shown 
in Appendix A. 



3 THE Pk QUANTITIES FOR A GIVEN 
PROBLEM 

The Pk's weights in equation are introduced in order to 
take into account the different degree of correlation of the 
terms of the sum. Their expression can be obtained once we 
define exactly what we are interested in. For example, one 
common criteria for one parameter estimation is to use the 
estimator which has the minimum RMS. We will consider 
this criteria here. 

For the problem of one parameter estimation described 
in the previous section, once we have the analytic expression 
for the RMS, and an initial guess for the C' matrix, we 
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can obtain the optimum set of P^'s using the "minimum 
RMS criteria". We minimise eq. IIIH with respect to the 
Pfe's quantities. We obtain that the P^'s quantities are given 
by the solution to the implicit set of equations 



Pkfk 



5Z a 



{C{e,)) 



fifjPiPjC'ij 



= 0, 



fc = 1, 



which can be solved numerically, using a Newton-Raphson 
scheme for nonlinear systems of equations. It should be 
noted that in the case when C'ij = 0, i 7^ j, equation 11211 
has the trivial solution P^ = 1, as we expected for the stan- 
dard case without correlations. The estimates obtained with 
these Pfc give us a better guess for C' , that could be used in 
equation I12II to obtain more appropriate values of the Pk- 
However, in practice, we have checked that for all the cases 
that we consider in this paper, this iteration is not necessary, 
since over the a priori uncertainty region of the parameter, 
the variation of the Pk is negligible. 

The previous expression, derived for the problem of to- 
tal power estimation, can also be applied to any problem of 
one parameter estimation, as follows. Let M be the param- 
eter we are interested in. If we expand the CF in a Taylor 
series around an initial guess, M = Mo, we obtain, up to 
first order, 

AC{ek\M) = C{dk\M) - C{ek\Mo) = 



dC{6k\M) 



dM 



AM + 0{AM'^), k = l,...,n (13) 



Mo 



with AM — M — Mo, so we can use equation 1121 . with 



fk = 



dC{ek\M) 



dM 



k = l, 



(14) 



Mo 



If we use an initial guess close to the real value, this 
linear approximation will give good results. The fk 's can be 
obtained numerically for each problem. 

When we deal with a problem of several parameters es- 
timation, it is not well defined what has to be minimised. A 
reasonable criteria for these problems, if we want to estimate 
the set M = {Mi,..., Mp}, is to minimise J], RMS{E[Mi]), 
where we define RMS{E[Mt]) as the RMS for each individ- 
ual parameter. Linearising around our initial guess. Mo, we 
can derive a simple expression for the RMS of each param- 
eter. In this case, we have that 
p 

AC(efclAf) =^/fc,,AAf, -I-O(AM^), k = l,...n (15) 

1=1 

where we define 



fk,: 



dC{ek\M) 



dM, 



i = 1, 



(16) 



Mo 



The estimate of the parameters is given by the solution to 
the linear system of equations dxjii/dMi — 0, where we 
have again neglected the dependence of a^{C{9k)) on the 
parameters. The general expression of the covariance matrix 



of the parameters is shown, for a general linear problem, in 
Appendix A. If we expand those matrices, we obtain that 
the general form of this estimate, for our problem, is given 
by 



E[AM.] = - 



J p E[C{9,)] 



c{ek,Mo) 



R ^ 

k 



(cm) 



i = l,...,p(17) 



where E\AMi\ = E\Mi\ — {Mo)i, and R and J are numbers 
■<Jb(tfe3iked from the fk,i's. From 11711 we can infer the general 
expression for the RMS in the case of several parameters, 
obtaining 



RMS{E[M,]) = 4 



E 

k,j 



Jk.iJjAPkPj 



-, 1/2 



kj 



a2(C(e,))a2(C(e,)) 



where i — 1, ...,p. Using the previous equation, we can ob- 
tain the Pk quantities for any problem, just minimising the 
product Y[ RM S{E[Mi]) numerically. Summarising, we will 
have a different expression for the P^'s for each particular 
problem. An application of these equations for the problem 
of two parameters estimation can be found in Section 6.1. 



4 DISTRIBUTION FUNCTION FOR THE 
MODIFIED 

The Xm proposed in eq. ^ corresponds to a sum of quan- 
tities which are not independent. If we had set Pi = 1, we 
would have the standard statistic, but still with correla- 
tions among the terms. Therefore, its distribution function 
will not be the standard one. 

The formal expression for the distribution of a con- 
structed from variables which are distributed following a 
multivariate gaussian distribution is given in Appendix C. 
This distribution, when applied to CMB analyses, was stud- 
ied in B93. There, they proposed that the distribution func- 
tion for the statistic ^ is given by an standard (rescaled) 
X^ function, but with an effective number of degrees of free- 
dom. This proposal is not exact (see also Appendix C), but 
it turns out to be a very good approximation for the true 
distribution function, as we shall see in the following sec- 
tion. In a general case, the error in the distribution function 
using our approximation will be a few percent. 

The basis of the approximation is to assume that corre- 
lations only reduce the degrees of freedom, but do not change 
the shape of the distribution ^ . Quantifying this argument, 
there exist a certain constant, A, which makes the statistic 



U^Axi 



(19) 



to be distributed as an ordinary x^ , with an effective number 
of degrees of freedom, rieff. This number, and the constant 
A, are obtained just by imposing the new distribution to 
have the mean and the variance of a standard x^j 



< f/ >= n, 



MS{U) =<U^ >- <U 



2n. 



(20) 
(21) 



^ This idea has b een used recentl y by other authors: 
IWandelt et"ai] i200(t : iHivon et alj i2002l) . 



(18) 
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where MS means mean square. Summarising, our proposal 
is that once the first and second moments are fixed, the 
whole distribution wiU foUow a very closely. For our prob- 
lem, we obtain 



A 



2 < xif > 



2<Xm > 
MS ixl,) 



-(22) 



J<T^(C(9.))<T^(C(e,)) 



(23) 



In this calculation, we needed to compute the MS of eq. (|3Jl. 
This result is obtained using the fact that the data points 
follow a multivariate gaussian distribution, and it can be 
found in B93 (see also Appendix B for similar calculations). 

In general, Uef / is a real number, so we have to consider 
the analytic extension of a standard distribution (which is 
known as the Gamm a distribution function, see for example 
IStuart fc OrdlJl994l) ). 



dFixl 



)dXn 



2"=//r(^; 



■exp( 



)(Xn 



-1 J 2 
dXn 



(24) 



just by replacing the factorial with the gamma function (F). 
In 12411 . dF is the probability of finding a value for U between 
X^n^fj and xii^fi + dXn^ff j and g stands for the probability 
density function. Once we know the distribution function, 
the confidence limits are given by integration bellow this 
curve. We assign a weight to each hypothesis as in a standard 
X^ analysis, integrating the distribution from the obtained 
value up to infinite. 



F(xl,. >U{X,M)) = 



U(X,M) 



gixi,„,neffiX,M))dxl 



(25) 



i.e., the probability of finding a value of Xn^ff bigger than 
or equal to U{X,A'I). Here, we explicitly write where the 
dependence in the data {X) and in the parameters (A/) is. 



5 CHECKING THE METHOD 

In this section, we will test the whole method in the problem 
of one parameter estimation, but first, we will study the 
quality of our approximation to the distribution function of a 
X^ with correlations. These two points will be done by means 
of Monte Carlo simulations. In order to d o that, we have 
chose n the JB-IAC 33 GHz Interferometer iMelhuish et alJ 
I1999I) as the reference experiment. 

This experiment is a two element interferometer, which 
operates at 33 GHz, at the Teide Observatory. It has two 
configurations, with angular resolutions 2° {i = 106 ± 19), 
and 1° {£ = 208 ± 18), respectively. The window function in 
both configurations is very narrow, so the results are quoted 
in terms of total power inside the band (band power). The 
experiment has g iven measurements on the power spectrum 
on both scales JPicker et al.1 119991 iHarrison et"al] l200Cll. 
which are consistent with the Boomerang data ld^^emajdia 
12000^ . We have the likelihood analysis implemented for this 
experiment, so the comparison with the new method will be 
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Figure 1. Distribution function for a with correlations with 
N = 10 terms. We show the histogram with the frequencies for 
the rescaled x^i obtained from 1000 realizations, for four different 
cases, varying w (signal-to- noise ratio). We use 10 bins of equal 
size to sample the distribution function. In all the figures, the 
dots represent the numbers coming from the realizations, and 
the error bars show their sampling error. The solid line is our 
approximation to the distribution function, using the value of 
rie// from the formula (shown within parentheses in the figure). 
It is also shown, using dot-dashed lines, the distribution function 
of a (rescaled) with N = 10. We can see the effect of the 
correlations as w increases. 



straightforward. In our analyses, we have used the compact 
configuration, and only one of the two channels (i.e., the real 
part of the complex visibility). 

The CMB realizations have been done assuming the 
following values for the cosmological parameters: n = 1, 
n ^ l,Q,b = 0.03, = 0.7 and Ho = 75fcm s'^Mpc-^. 
For this model, the total power inside the window function 
(or band power) is BP = 51.45/^7^ for the short configura- 
tion [l = 109 ± 18). This number is related with a sky by 
a conversion factor, which is obtained using the fiat band 
power approximation (i.e. BP — £{£ + l)Ci/2n constant 
inside the window function) in equation ((gj. For our in- 
strument, this conversion factor is BP — BAAasky, which 
gives Gsky = 9A6jj,K for the previous model. For this ex- 
periment, the sensitivity in a 30s integration is given by 
CTnoise ~ 250 f^K / Ndays, where Ndays is the number of ob- 
serving days. Therefore, the signal-to-noise ratio is given by 

_ 2 / 2 
^ ^ sky / ^noise • 



5.1 Our approximation to the distribution 
function 

The first point is to check the validity of our approximation 
to the distribution function for a x^ with correlations. We 
will study the case when the Xi quantities entering in the 
X^ follow a multivariate gaussian distribution. This is the 
case for an (ideal) CMB map, where the temperature at 
each pixel has two contributions, one coming from gaussian 
noise, and another one from the cosmological fiuctuation 
field, which is supposed to be also gaussian. Using the CMB 
terminology from Section 2, we will study the distribution 
of the statistic 
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Table 1. Values of n, 



ff 



and A, for TV = 10. 



w 


Theoretical values" 


Numerical values'' 




A 




A 


0.34 


8.4 


0.84 


8.4 


0.84 


0.69 


6.7 


0.67 


6.8 


0.67 


1.37 


5.0 


0.50 


4.8 


0.47 


2.74 


3.9 


0.39 


3.8 


0.37 



" Computed using eqs. I22i and 1231 . 

^ Computed using 100 CMB realizations, from the numerical 
value of < > and MS{x^). 



(26) 



sky 



and we will compare it with the proposed approximation. 
This is a particular case of (0 , when Pi = 1. We will study in 
detail this case here, but our results are completely general. 
It should be noted that our proposal is exact, by definition, 
in the two limit cases of no correlations at all, and totally 
correlated points. The first one correspond to the definition 
of the distribution function, and the second one is the 
case of a X with A'^ = 1. 

In Appendix C we present the formal aspect of the dis- 
tribution function for 12611 . and we study in detail, analyti- 
cally, the case N = 2. The cases with low A*' turn out to be 
the critical ones, because the shape of the distribution func- 
tion differs strongly from a gaussian. In the limit of high A*', 
both our approximation and the real distribution function 
tend to a gaussian distribution (the same one, by definition), 
as a consequence of the Central Limit Theorem. Therefore, 
it is interesting to test our proposal for a intermediate range 
of values of A'^. We have done so, and we will present here, 
as an example, the case for A'^ = 10. 

We generate CMB realizations with noise, for a ten pix- 
els map. From each simulation, we compute 12611 . and from 
the whole set of values obtained, we study the histogram 
with the frequencies, and we compare it with the proposed 
one, for several values of the signal-to-noise ratio. We show 
these results in Figure^ In general, the asymptotic shape of 
the distribution is very well reproduced, and the largest dif- 
ferences always occur for low values of ■ This is precisely 
the kind of approximation we need, because in a statistical 
analysis we usually are interested in the tail of the distribu- 
tions. 

We can see that the distribution of the x^ with cor- 
relations among terms is compatible, inside the numerical 
precision, with an standard (rescaled) x^y with an effective 
number of degrees of freedom, smaller than A'^. 

We have also checked that the numerical values for n^f / 
and A are correctly given by equations l|23|l and 1)22^ . In 
Table^we compare the values obtained from the simulations 
with the predicted ones given by the theoretical formulae. 
Their difference is in all cases smaller than the sampling 
errors, so we conclude that our expressions give the correct 
values for these parameters. 



5.2 Applying the method to one parameter 
estimation 

We will now test our method in the problem of one pa- 
rameter estimation. The idea is to compare, by means of 
Monte Carlo simulations, our method with the Maximum 
Likelihood on the full map, which is widely accepted as the 
optimal method for CMB analyses. We will consider in de- 
tail the problem of determining the total power measured 
by a given experiment, so our parameter will be f^j.^. 

Both the Xm and the ML are, by construction, almost 
unbiased methods for determining the total power of an ex- 
periment. In order to compare them, we have studied the 
power of each one. The power of an statistical method, when 
determining a certain parameter, is characterised by the 
RMS of the estimate of this parameter. The smaller this 
value, the more powerful is the method. To compute it, we 
will use Monte Carlo simulations for a fixed CMB sky plus 
simulated noise. We will consider different values for the 
signal-to-noise ratio, and for each one, we will obtain the 
RMS for each method. Finally, we will also compute the 
degree of coincidence of both methods, which can be pa- 
rameterised through the quantity RMS{ML,Xai), defined 
as the RMS of the difference between their estimates. 

For the experiment we are considering, we have gener- 
ated simulations for an observation of declination -|-41°, and 
R.A. range 8h-18h, for a fixed CMB signal {asky = 9A6fiK). 
Each realization contains, for a single channel, 300 data 
points, with a pixel size of 0.5 degrees. Given that this ex- 
periment has a narrow window function, the band power is 
directly a measurement on the power spectrum. 

The values of the Pk's turn out to be not critical in this 
problem. The solution to the equation 1121 is Pk ~ 1, Vfc, 
so we use here Pk = 1. The results of the realizations are 
summarised in Table |21 We conclude that, in this problem, 
the maximum likelihood on the full map and the Xia method 
have the same power, within the uncertainty. 

If we study the degree of coincidence of both methods, 
by computing the quantity RMS{ML,xli), we conclude 
that both methods are highly coincident when the signal 
to noise is low. For example, for w = 0.13, the RMS of 
both methods is ~ llBO/iK , and the dispersion between 
estimates is 590 fj,K^ , roughly half the RMS. So, not only 
the power of the methods is similar for low w, but also the 
estimates are. For high values of w (w > 0.6), both meth- 
ods tend to be independent, in the sense that the degree of 
coincidence RMS{ML,x\i) approaches to the value of the 
RMS. 

Once we have the (approximated) distribution function, 
we can determine the confidence region for one particular 
experiment. Usually, the size of this region is given by the 
68% of the probability. Our error bars has to be interpreted 
in a frequentist way. That is, if we make lots of simulations, 
the true signal will lie inside the confidence region of each 
realization the 68% of the times. It is known that this value 
does not necessarily represent the 'error bars' defined in the 
usual Bayesian way, by treating the band-power likelihood 
function as a probability distribution. 

As an example of application of our method, we have 
analysed the data from Dicker et al. (1999). These data cor- 
respond to declination -1-41°, observed with the low resolu- 
tion configuration {£ — 106 ± 19). The value obtained using 
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Table 2. Comparison of the power of the ML and the methods for a single parameter estimation: the total power measured by 
an experiment (t^j,^). The values were obtained from Monte Carlo simulations of the model a^i^y = 9A6^K, which corresponds to 
BP = (blA^p.K)'^ = 2647/i-fsr^. We quote the results in terms of band power (BP). 



x\,f method ML method 



(fiK/pixel) ilJ-K^) (pK'^) (m-K"^) 



90 


26.35 


0.13 


7.27 


1170 


1150 


590 


120 


22.82 


0.17 


6.94 


980 


1065 


520 


240 


16.14 


0.34 


4.50 


800 


805 


410 


480 


11.41 


0.69 


3.43 


640 


680 


410 


960 


8.07 


1.37 


2.95 


570 


640 


420 


1920 


5.71 


2.74 


2.67 


530 


570 


440 


3840 


4.03 


5.51 


2.65 


500 


540 


490 



" We report here the RMS for both methods, obtained from 500 CMB plus noise realizations, for different signal-to-noise ratios {w). 
The sampling uncertainty in the RMS can be obtained as ""Ij^^g = ^RM S'^ /Nrsalizationa- We can see that the values coming from 
both methods are compatible, inside that uncertainty. We have also checked that both methods give unbiased estimates, i.e. 
< E[a'^i^^] >= We do not include the values for < E[a^^^] > in the table for clarity. 

^ We report here the degree of coincidence of both methods, i.e. the quadratic dispersion of the estimates from both ones. The values 

were obtained from 12 CMB plus noise realizations. 



the likelihood analysis is AT* = 4315^2^7^. In order to com- 
pute the C', we use a value for AT = 40/iA'. In any case, 
our result does not depend on this initial value, and starting 
with another one (AT — 20fiK or AT — 60jj,K, for exam- 
ple) gives the same result in the first iteration. The estimated 
value using our method, with Pk = 1, is AT = 
where the CL. are defined as the 68%. The effective num- 
ber of degrees of freedom is this case was rie// = 18.14, and 
the total number of points where the CF was evaluated was 
n — 20. The lag used for sampling the CF was 0.67 deg, but 
the result is not sensitive to changes in this number. 

We have obtained the same estimate as the likelihood, 
but our confidence region is smaller. As we have pointed 
before, this probably is due to the fact that the confidence 
region has a different definition in both methods. In order 
to compare those confidence levels, we perform Monte Carlo 
simulations, using the measured signal and the experimen- 
tal noise, and we obtain the equivalence between the confi- 
dences levels for both methods. We conclude that the region 
that contains the 68% of the area of the likelihood around 
the peak, corresponds to ~ 75% of the probability in a fre- 
quentist sense (i.e., that region contains the true signal the 
~ 75% of the times). This explains why the likelihood give 
us bigger error bars in this particular problem. In a general 
case, we will have to repeat this analysis for the likelihood 
estimate, in order to compare the sizes of the confidence 
levels. 



6 ESTIMATING SEVERAL PARAMETERS 

In the previous section, we have proved that our method, 
when constructed from the CF, has the same power as the 
likelihood when determining a single parameter (an overall 
normalisation). We now probe if this is true for a larger 
number of parameters. 

For the case of several parameters estimation, the CF 
has proved to be a very good statistic in determining the 



power spectrum of the CMB dSzapudi et alJl2001a^ . In their 
paper, they obtain the C^'s by a Gauss-Legendre integration 
of the CF. When applied to simulations of the Boomerang 
data ide Bernardijl2000r) . the error bars for that method, 
coming from Monte Carlo simulations, are of the same order 
as the sample variance, which is the theoretical limit to the 
size of the error bars. 

Here, we will apply our method to the COBE Differen- 
tial Microwave Radiometer data in order to obtain the C^'s 
in two cases. The first one, using the power law parameteri- 
sation of the angular power spectrum, in terms of the spec- 
tral index of the primordial spectrum, n (with P{k) oc fc"), 
and the normalisation parameter, Qrms-ps- In this case, 
using our notation, we have M — {n,Qr,n3-ps}- The de- 
pendence of the Ce's in these parameters, for a pure Sachs 
& Wolfe spectrum (which dominates in the consider ed mul- 
tipole range), is given by jBond fc Efstathioulll987[l 



_ 4^ r{£+^)TC-^) 



(27) 



The second case will be to estimate several Ct's directly, i.e.. 



6.1 Estimating (n,Qrm3-Ps) for the COBE data 



CF has been applied for COBE analyses of 
n ,Qrms-Ps) p arameters by other authors. In 



The 

the (n,i^rms- ps 

iHinshaw et alJ 1 1996bfl . the quadrupole normalisation is in- 
ferred using Monte Carlo-based gaussian likelihood analy- 
sis for a scale-invarian t (n = 1) power- law spectrum. In 
iBunn. Hoffman, fc Silkl l)l994h (hereafter BHS94), they de- 
termine n and Qrms-ps as the best fit values to the com- 
puted CF. They show, by means of Monte Carlo simula- 
tions, that this method is not optimal for the determination 
of those parameters, because they obtain a large RMS when 
trying to recover them simultaneously (see Table 2 in that 
paper). Nevertheless, other estimators, such as the direct 
evaluation of the atrn, give smaller RAlS's. We will use here 
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the CF, but with our method, to probe whether we obtain 
good results. 

In order to test our method with these two parameters, 
we perform Monte Carlo realizations of COBE like maps, 
using a scale invariant power spectrum (n = 1) with a nor- 
malisation Qrms-ps = — 324fiK^ . We will use 
the standard COBE pixelization of = 6144 pixels (in- 
dex level 6), in galactic coordinates. We include in our re- 
alizations the noise level corresponding to the combined 4- 
year COB E map of the th r ee freq uencies, using the weights 
quoted in iHinshaw et"ai] lll996bl) (this map is referenced 
there as 31-1-53-1-90). The CF has been sampled using a 2.6° 
step, as it appears in that paper (that size correspond to the 
typical pixel size) . Anyway, we have checked that our results 
are consistent when changing that step. In our calculation 
of the CF, we use a galactic cut |6| > 20°. 

We will use the Pk quantities given by the minimum of 
the function RMS{E[n]) x RMS(E[Qrms-ps]), as we have 
discussed in Section 3. Those RMS's can be derived, using 
the linear approximation to the CF, from equation 11811 . In 
this problem, the R and J quantities are given by 



R 



\ ^ Pifi,l \ I \ ^ Pifi.2 

2^ a^iCm ) I 2^ a2(C(eO) 



E Pifi,lfi,2 



10 



10 



20 



30 40 
k 



60 



70 



Figure 2. Optimum set of P^'s for the 31 + 53-1-90 4-year COBE 
map (see details in the text), using the galactic cut |6| > 20°. 
These values were obtained by numerical minimisation of the 
product RMS[E[n\) X RM S{E[Qrms~ P s\) , using the linear ap- 
proximation to the CF. Each value for fc (1 ^ fc 70) corresponds 
2 to an angular distance, given by Sj. = (fc — 1)2.6°. An explanation 
of this peculiar structure can be found in the text (Section 6.1). 
(28) 



Jk.l — fk,l 



PjfjAfj, 



J, 2 



^ <72(C(e,)) 



(29) 



where 1 stands for n, and 2 for Qrms-PS- The equation for 
Jfe,2 can be obtained from Jfc,i, just interchanging 1^2. 

In order to check the previous expressions for the RMS, 
we use 100 of the above mentioned realizations of COBE 

1/2 

like maps (n = 1; Qj^s-ps ~ 18/^^^), and we analyse them 
using several sets of Pfc's. In this way, we can obtain the 
real value of the RMS, and compare it with the number 
coming from the formula. The results are summarised in 
Table |H] In all cases, the theoretical numbers obtained from 
equation (1181 are in agreement with the numerical results, 
so we conclude that the linear approximation to the true CF 
works well in computing the RMS. The largest differences 
occur when we obtain a large RMS, due to the fact that, in 
that case, fails the linear approximation to the CF. 

The average values recovered for n and Q from Monte- 
Carlo simulations show that the estimator is unbiased, as 
we would expect. It should be noticed that, in Table |5] 
the effective number of degrees of freedom is quite small. 
In all cases, we obtain rie// < 3, but we are using n = 70. 
The reason is that the CF contains long-range terms, com- 
ing from low multipoles {£ ^ 2). This fact reduces the de- 
grees of freedom drastically, so the choice of the Pk will 
be critical in this problem. The numbers obtained when we 
consider the whole CF and Pk = 1 are compatible with 
those in BHS94, but slightly better because we consider the 
noise of the 4-year COBE map. In that paper, they ob- 
tained, using the same galactic cut (|6| > 20°), and the 
noise from the 2-year map, the values RM S{E[n]) — 0.96 
and RMS{E[Qrni3-Ps]) = 253^4^'^ (in our units). Never- 
theless, we see that considering only the first points, and 
setting to zero the others, strongly reduces the RMS of 
the estimate, even below the values obtained when they do 
not consider noise and incomplete sky coverage (they have 
RMS{E[n]) = 0.36 and RMS{E[Qrms-ps]) = 175^^^^). 



Finally, we can obtain the optimum set of Pk's by nu- 
merical minimisation of the prod uct of RMS's. W e have 
used the Fortran program amoeba JPress et alll986l . p.402). 
The obtained value for these quantities is shown in Figure 
|5| These numbers do not depend on the initial guess for n 
and Qrms-PS within the a priori region of uncertainty. The 
values obtained have a peculiar form, but it can be under- 
stood as follows. We see that the terms which contribute to 
the optimum estimator (minimum RMS) correspond to the 
first ~ 10 degrees, i.e. the first part of the CF, as we would 
expect. But there are also two regions, one at ~ 50°, and 
another at ~ 125°, which contribute to the estimator. These 
peaks are located just in the zeros of the quadrupole {£ — 2) 
which is the multipole with the largest cosmic variance. By 
using those points, we add some information to the estima- 
tor (coming from higher i values), but we do not increase its 
variance. In any case, to consider or not those points do not 
affect too much to the power of the method (in Table 13 the 
values obtained when using the first 10 points from the CF 
are close to those obtained with the optimal set). Following 
this interpretation, one could think about other combina- 
tions of the Pk parameters using others points (for example, 
taking two points at ~ 40° and ~ 105°. We have explored 
this possibility as an illustration, and we find the values 
RMS{n) = 0.31 and RMS{Qrms-ps) = 144 which 
are similar but slightly higher than those obtained for our 
optimal Pfe. Thus, we can conclude that for this problem, we 
can find a set of estimators (one for each one of these sets of 
Pk values) which give similar RMS values when estimating 
n and Q, and as we show below, these values are comparable 
to those given by the likelihood method. 

Finally, we have applied our xlf-test for the CF to the 
actual 4-year COBE data. Our estimates, from the analysis 
of the 31 + 53 -1- 90 map, using the galactic cut |fej > 20°, 
a step of 2.6° to sample the CF, and the optimum set of 
Pfe's, are E[n] = 1.08, and E[Ql'^^_pg] = 1^.2nK, so our 
estimate of the parameters, using the true RMS from 100 
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Table 3. Results of Monte Carlo simulations for the determination of n and Qrms-PS with the x|,/ method, using different sets of 
Pk parameters. We explore the cases of: (a) an uniform value of for k = 1, kmax and zero the others (first four rows, quoted 

1/2 

as 'Uniform, kmax'); and (b) the optimum set of P^. values. The MC simulations have parameters n = 1 and Q^ins-PS ~ l^M^i ^nd 
the correlation function is sampled at 70 equally spaced bins of size 2.6°. We use the galactic cut |6| > 20°, and the noise levels of the 
combined 4-year COBE map. 



Pk " 


n 


Qrms-PS [M-R"^] 




RMS{n) ° 


RMS{Qrms-Ps) [f^K^] " 


Uniform, 70 


1.04 ±0.54 


323 ± 184 


2.55 


0.60 


190 


Uniform, 41 


1.02 ±0.50 


331 ± 178 


1.92 


0.52 


177 


Uniform, 10 


1.08 ±0.34 


307 ± 147 


1.12 


0.28 


133 


Uniform, 3 


0.99 ± 0.45 


327 ± 145 


1.05 


0.34 


147 


Optimal 


1.03 ±0.28 


316 ± 141 


1.78 


0.19 


114 



" Adopted values for the P^'s. The last row is the optimum set of Pj, values (see Figure 1^. 
^ Results from 100 Monte Carlo simulations of COBE data (see details in the text). The first number is the average value for the 

parameter, and the second one is the RMS from the simulations. 
RAIS values obtained analytically, using the linear approximation to the CF (see Section 3). 



1/2 

realizations, will be n = 1.08 ±0.28, and Q^ins-ps ~ 15.2 ± 
3.5fj,K. This result has to be compared with the lik elihood 
analysis using these data (see lHinshaw et all 1^963), Table 
1). They obtain for this map n = 1.25t°;29, Qlms-ps ~ 
15A'^2 gf^K . The estimates from both methods are nearly 
the same, and now the error bars coming from the CF are 
compatible in size with those coming from the maximum 
likelihood method. For comparison, using no weights {Pk = 
1), we would obtain E[n] = 1.10 ±0.54, and -E[Q^^Vps] = 
(16.0 ± 4:.8)fiK, so we can see that using the Pk's for this 
problem is essential. 

6.2 Estimating band power spectra for COBE 
data 

Finally, we will apply our method to obtain band power 
spectra for the COBE data. We have used the realizations 
from the previous subsection [n — 1; Qli^g_ps ~ ISfiK), 
and the sam e COBE map. We will compare our results with 
those from iHinshaw et alJ lll996al) (see Table 2 in that pa- 
per). We have used exactly the same £ range: four multipole 
bands between £ — 2 and £ = 40. Those bands are: 2 ^ £ ^ 5; 
6 s; ^ sC 10; 11 ^ sC 20 and 21 sC £ ^ 40. Using those real- 
izations, we have checked that the method is unbiased when 
applied to power spectrum estimation. 

When applied to the 31 ±53 ±90 4-year COBE map, we 
obtain the results that are shown in Table |1] We quote the 
band power values in terms of the quadrupole normalisation 
expected for a scale-invariant power-law spectrum within the 
specified range of £. The quoted values for our method have 
been obtained from the optimal set of Pk for this problem. 
When compared with the ML data, we can see that the error 
bars are of the same order in both cases, as in the previous 
subsection. So we again obtain a method of a similar power 
to the likelihood on the pixel map. 

The estimates from both methods are consistent in all 
bins, except the apparent inconsistency at the second one, 
where the two estimates differ in more than 3 times the size 
of one error bar. Nevertheless, the statistical significance of 
that deviation has to be computed as follows: given that 
we do not know the true value for the band power, when 
comparing two results we have to consider the difference of 



Table 4. Band power values for the COBE data (31±53+90 
map)". 







Multipol 


e Band 




Method 


2 <^ ^ 5 


6 ^ £ < 10 


11 s; ^ < 20 


21 ^ £ < 40 


xif- optimal 
ML 


17.0 ±3.3 
18.6l« 


9.0 ± 2.6 

ID. (_2 


17.6 ± 1.8 

2O.3I2:? 


0±4.4 

1 0+^^-^ 

^■^-1.0 



" These band power amplitudes are expressed in terms of Q^^^, 
i.e. the quadrupole normalisation expected for a scale-invariant 
power-law spectrum within the specified range of £. The units are 
fj.K. 

ML values from lHinshaw et al ]E9963) for the same map. 



both estimates, and so we have to compute the variance of 
the difference. In this case, the difference is 16.7 fiK — 9fiK = 
7.7fiK, and the RMS of the difference of those two estimates 
is RMS ~ \/2.6^ + 2.2^ ~ SAfiK. Here we are assuming the 
fact that both methods are almost independent, following 
the results for the case of one parameter estimation, with 
signal-to- noise ratios of the order of 1. Therefore, we find 
a deviation at the 2.3-sigma level, which corresponds to a 
fiuctuation of 1 in 47 (for a normal distribution). This can be 
understood given that the ML and the Xm ^.le two different 
methods: the first is based on the full map, and the second 
on the correlation function. Therefore, the estimates will be 
different in general, although as we can see, both methods 
have similar power, so there is no reason to consider any one 
of them as "the estimate" . 

Summarising, we have seen that it is possible with our 
method to perform an analysis with a similar power to the 
ML, even for the case of several parameters. 



7 DISCUSSION AND CONCLUSIONS 

In this paper we have presented an statistical method to 
analyse CMB maps. It consists in a variation of an stan- 
dard x^-test for the case when we have correlated points. 
Here, our test has been constructed from the two-point cor- 
relation function, following the proposal from other authors 
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jSashinskv fc Bertschingeill200j) that the CF contains aU 
the relevant information concerning the cosmological pa- 
rameter estimation. In this hne, we propose a Xm based on 
the CF, which is a different approach from the "usual x^" 
method (which uses the full covariance matrix C'). Our pro- 
posal explicitly uses only the diagonal terms of the covari- 
ance matrix of the data, but we introduce certain weights, 
which now implicitly contain all the correlations. This ap- 
proach has two important 'computational' advantages com- 
pared with the "usual x^'\ or with a likelihood based on the 
CF: 

• we do not need to invert the covariance matrix, C'; all 
the quantities (the XM) "^eff, A and the expression for the 
RMS) depend on C" directly. 

• even more, if we have a problem with a low value for 
rie//, the effective number of Pk to use (i.e. the number of 
Pk's which are significantly different from zero) will be also 
small, typically, of order 0(n^ff). So it is not even necessary 
to use all the terms of the diagonal. 

These advantages are more important when compar- 
ing our method with the standard ML analysis on the full 
map. We do not need to invert the covariance matrix of the 
map (A'^ X A'^), and we only need to concentrate on a few 
numbers, so the problem is computationally accessible if we 
have to deal with large datasets. It it important to stress 
here that the Xm is not an approximation to the ML on the 
full map, but a different approach (i.e. both methods will 
give different estimates and probability contours for a given 
problem). So the xlf can be applied to any problem, but 
it has to be checked, by means of Monte Carlo simulations, 
that the method has a similar power to the maximum like- 
lihood. As we have seen, this is the case for several CMB 
common problems (power spectrum estimation, and cosmo- 
logical parameter estimation). 

The largest computational effort in our method has to 
be done estimating the CF, which is a ~ A'^ operation. 
Never theless, there are estimators for the CF m ore effi- 
cient llKashlinskv et al.ll200ll: ISzapudi et"ai]l2001bft . so our 
procedure could be applied to current WMAP data, and 
PLANCK simulated data, but this will be treated in detail 
in future works. 

Our method can be extended for general noise covari- 
ance matrices. We only need to compute Cn{9) in the same 
way as the CF, and introduce it in equation If we have 
the noise matrix, it is straightforward to obtain Cn{9)- But, 
if we do not have the noise matrix, we can obtain an esti- 
mate of the correlation function of the noise using MC sim- 
ulations. This idea ha s been used recently by other authors 
JSzapudi et al.ll2001b^ . It must be noted that if the noise 
changes substantially from pixel to pixel, then we would 
have to use weights in equation Q to compute the CF in a 
more efficient way. 

We have also presented an approximation which gives 
very accurately the distribution function for a con- 
structed from a set of multivariate gaussian variables. This 
proposal can be extended to approximate the distribution 
function of any quantity made of a sum of squares, each 
of them distributed (exactly or approximately) following a 
gaussian distribution. 

To conclude, we propose that, if we are interested in 
obtaining a certain set of parameters, M, we can use an 



unbiased estimator of certain quantities depending on these 
parameters, provided that they contain all the relevant in- 
formation to these parameters (in our case, we have used the 
CF at certain angles). From it, we may obtain, by varying 
the Pk's, the best estimator of those parameters. In this 
paper, we have tested this proposal, using the two-point 
CF as the reference estimator, in CMB problems. For the 
case of one parameter estimation (the normalisation of the 
spectrum), our method, with the CF, turns out to be as 
powerful as the ML. When applied to COBE data, we have 
shown the importance of choosing the right set of Pk's. In 
the optimum case, we obtain a value for the RMS two or 
three times smaller than the one obtained without weight- 
ing at all. In this problem, the Pk's are critical because the 
effective number the degrees of freedom is very small. The 
reason is that when we have strong correlations {rieff small 
compared with n), the structure of these correlations, which 
is encoded in the Pfc's, will be relevant, so there will be 
a considerable difference between the optimal weights and 
Pk — 1. When analysing CMB data which contain large 
scales (low multipoles), we are considering correlations over 
long distances. All the points are correlated with the oth- 
ers due to these low multipoles (quadrupole, octupol e,...). 
In the case of Boomerang data JSzapudi et al.ll2001a^ . the 
effective number of degrees of freedom will be larger (if we 
throw away the large scales), so the Pfc's will be closer to 
1, and a standard test based on the CF should produce 
good results. 
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APPENDIX A: RELATIONSHIP BETWEEN 
THE USUAL x^ AND THE xli FOR GAUSSIAN 
LINEAR PROBLEMS 

In this Appendix we will study the relationship between the 
usual x^ analysis (the standard approach for the case of cor- 
related data points, which uses the full covariance matrix), 
and our approach, for gaussian problems in which the model 
depends linearly on the parameters, and with a covariance 
matrix independent of the parameters. This case is not ex- 
actly equal to the usual one in CMB analyses, but it is very 
close and is particularly suitable for illustration. 

We will use here the following notation. Let y = 
{yi, ...jy-n) be a 1 X n matrix containing the n data points, 
which, by hypothesis, are distributed following a multivari- 
ate gaussian distribution. Let a = (ai,...,afe) be a 1 x A; 
matrix whose elements are the k parameters of the model. 
Let X be a A; X n matrix, also given by the model. Their 
elements are defined so that the mean value of y, < y >, 
is given by the matrix multiplication ax. Finally, let M be 
the covariance matrix of the problem, defined as 

M =< (y - ax)'^(y - ax) > 
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where T stands for the transpose. Using the previous def- 
initions, the usual Bi,nd our Xai ^-re given, respectively, 
by 

2 



X 



(y-Qx)M ^(y-Qx)' 



Xm = (y - "x)V (y 



(Al) 
(A2) 



where we have defined the matrix V using the diagonal of 
the covariance matrix, and our weights (Pi), in the following 
way: — Ma/ Pi, for i = l,..,n, and Vij — for i 7^ j. 
It should be noted that the V matrix depends implicitly on 
the weights (-Pi). 

For this family of models under consideration, the op- 
timum estimator is the maximum likelihood, which is given 
by 



£ oc 



exp(-ix^) 



(A3) 



det{M) 

Until this point, we have not made use of the fact that 
the covariance matrix is independent of the parameters. If we 
use it now, from the last equation we have that the maximum 
likelihood reduces to the usual for this problem. The 
estimate for both methods is obtained by minimising the 
previous expressions with respect to the parameters, so we 
have 



"^x'^)(xM"^x'^) 



Ei[a] = (yM 



E2[a] = (yV-'x'^)(xV- 



(A4) 
(A5) 



Hereafter in this section, we will use subscript 1 for the stan- 
dard (usual x^) method, and 2 for the Xai- We compute 
now the covariance matrices of the parameters, W, which 
are given by 



Wj = Variance{Ei[a]— < Ei[a] >) 
W2 = Variance{E2[a]— < £2(0] >) = 
= (xV"'x'^)~^(xV"'MV"'x'^)(xV"'x'' 



(xM^'x^)"^ 



(A6) 



(A7) 



Using this notation, the MS for each parameter is given by 
the corresponding element in the diagonal of W. 

The following point is to compare the estimates of both 
methods. For the case of one parameter estimation, it can be 
argued that both method give the same estimate, and there- 
fore have the same RMS. The argument is as follows: in this 
case, the estimate for both methods is a linear combination 
of the n quantities y. So it could be possible, in principle, 
to fix the n quantities (Pi) to equalise the n coefficients 
in expressions IIA4t and (IA5II . Given that for this partic- 
ular problem the usual is the optimal method (i.e. the 
one with the minimum variance), and that the correspond- 
ing estimator is the only linear one for which this variance 
is obtained, those Pi quantities which set equal the coeffi- 
cients in IIA4t and 1A5II . are exactly the same that would 
be obtained by minimisation of the RMS of that param- 
eter. We have checked this statement for the critical case 
where we have a with only n = 2 terms. In Figure lAll 
we present several particular examples, showing that for the 
set of Pi quantities that minimise the RMS for the Xai: we 
always obtain the same RMS as in the case of the usual x^- 
We have also checked that, for those values of the Pi, the 
estimates from both methods are exactly the same. 

Let's consider now the case of several (fc > 1) parame- 
ters. For this problem, it is not possible in general to obtain 




Figure Al. RMS for the standard method with correlations, 
and our xf^ method, for several linear gaussian models with only 
one parameter, a, and n = 2 variables, named {yi,y2), normally 
distributed with means < yi >= 0x1 and < j/2 >= ax2. We 
also assume a covariance matrix independent on the parameter, 
and we parameterise it as Mn = cr^ = 1, M2 = ""2 = fi ^^'^ 
M12 = M21 = C12, but these results are completely general. 
We plot here four typical cases for this problem. The dot-dashed 
line in all four panels corresponds to the usual value for the 
RMS in the estimate of a. The solid line is the RMS obtained for 
the Xm' using the shown value of Pi. The P2 is obtained from 
the normalisation equation Pi -I- P2 = 2. We see that it is always 
possible to find a certain value of Pi , for which we have exactly 
the same RMS for both methods. For that Pi value, the estimates 
are also obtained to be equal (see text for details). 



a set of Pi quantities which render the RMS of the usual 
X^. In any case, we have checked that it is always possible 
to choose those quantities to make the estimate for just one 
of the parameters exactly equal, and so its RMS. We have 
studied this problem in detail for the case of n = 3 terms in 
the x^, and k = 2 parameters. The results are the following: 

• if we choose to minimise the RMS of only one param- 
eter to find the optimal Pi quantities, we can always make 
the estimate for that parameter exactly equal to the usual 
X^. For those Pi values, the estimator for the other parame- 
ter is very close to the optimum, in the sense that the RMS 
for the other parameter at the most 10% bigger that the 
optimal one. 

• if we minimise the product of the two RMS's, we find 
in all cases that both estimates are close to the optimal ones, 
and the largest relative differences between Wj and W2 are 
smaller than ~ 1%. 

Therefore, we can conclude that the criteria to obtain 
the Pi quantities has an small ambiguity, in the sense that if 
we are interested in one parameter in particular, we should 
minimise the RMS for that parameter only. In practice, 
this ambiguity is not relevant because, for this problem, the 
differences between the estimates and the obtained RMS 
values are negligible. Therefore, we will maintain the orig- 
inal proposal of minimising the product of RMS's, which 
in some sense is equivalent to minimise the average size of 
the confidence region. All these arguments can be applied 
to cases with k > 2 parameters. 
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APPENDIX B: USEFUL QUANTITIES FOR 
MULTIVARIATE GAUSSIAN DISTRIBUTIONS 
UP TO 4TH ORDER 

The probability density function for a multivariate gaussian 
distribution of n variables (n— MGD), X = {Xi, Xn} , is 
given by 



(27r)"/21Cli/2 



exp 



^{X-XfC-\X^X)\iBl) 



where C is called the covariance matrix, \C\ stands for its 
determinant, and X is the mean of X. The elements of C 
are given by 



Cij — < [Xi — Xi)[Xj — Xj) > 



(B2) 



We define = Ca. When computing the mean square of 
equation , or the C" matrix given in Q , we need to know 
the following quantities: < XfX'^ >, < XfXjXk > and 
< XiXjXkXi >. We will obtain them here. For a 2-MGD, 
we can obtain the quantity 



2n J\C 



Xfxix 



X exp 



dXidX2 



= (Ticrl + 2Ci2 



(B3) 



For a 3— MGD, we obtain 

< XIX2X;, ■i\C\\A22Ay, - yl^3)(yliiA23 - ^12^113) 

-2\C\A2-i (B4) 

where Aij stands for the elements of the inverse of the C 
matrix. Using the same notation, for a 4— MGD, we obtain 

< X1X2X3X4 \C\ I 2A12A34 - A^iA2Z - ^13^24 I - 



-SlCr ^12^134 + ^14^24^33 ^14^23^34 " ^12^33^44^ 



-7413^24^34+^13^423^44 ^11^23^24 + A?2^34- 



^11^22^34 + ^413^14^22 - ^12^14^23 ^ ^12^13^24 (B5) 



APPENDIX C: PROBABILITY DISTRIBUTION 
FUNCTION FOR A CONSTRUCTED FROM 
MULTIVARIATE GAUSSIAN VARIABLES 

The moment generating function of the quadratic form = 
, Pj Xf I , where the X = \Xi, X„} variabl es follow 
a n— MGD with zero mean f eq. lHTl . is given by (see iMathait 
Jl993h : IStuart fc OrdI t994) ) 



G{t) = Y[{l-2tX,)-' 



/2 



(CI) 



In this equation, Ai, An are the eigenvalues of the matrix 
EC, where C is the covariance matrix IIB2II . and E is defined 
as Eij — {Pi/ai)Sij, being Sij the Kronecker-delta. From 
this expression, the distribution function can be obtained 
by the Laplace inverse transform. Formally, we have 



*„(x') = i-'[GM)] 



(C2) 



where L ^ stands for th e inverse Laplace transform 
(iGradshtevn fc Rvzhiklll98l p. 1142), and we use the nota- 
tion ^„(x^) for the exact distribution function of the x^- As 
an example, we will study in detail the case n — 2, compar- 
ing the exact distribution function with our approximation. 



CI Distribution function for n=2 

The analytical express ion for the distributi o n fun ction 
can be obtained using IGrad shtevn fc Rvzhild lll98Cf) . eq. 
(17.13.9), p. 1143, and the convolution theorem for Laplace 
transforms. We obtain 



*2(X 



2y^ 



exp 



4q/3 



(«+/3) 




(C3) 



where a and /3 are the two eigenvalues of the EC matrix, 
and Jo is the zero order I Bessel function. 

For this problem, is also easy to obtain the analytic 
expression for the k-order moment of the distribution, using 
the binomial expansion. We obtain 



i=0 



piP2 



E 

j=0 



(Ci2)''"'^2'=lCl^r(i + l/2)r(fc - j + 1/2) (C4) 



We compare both the distribution function and the k- 
order moments up to fc = 4 with the values obtained using 
our approximation. The results quoted here correspond to 
the case Pi = P2 = 1, and af = a2 ~ 1, so C12 varies in the 
range [0, 1]. Nevertheless, the results are completely general. 

For our problem, we can write a = I-I-C12, /3 = 1 — C12, 
and rieff = 2(1 -I- Ci2)~^. In Figure IHTI we present the ^'2 
function for the value of C12 which gives us the maximum 
percentage difference between the exact and the approxi- 
mated functions. This value corresponds to C12 ~ 0.825. 
The largest percentage difference in the distribution func- 
tion for this case is reached at x^ ~ 0.388, and has a 
value of ~ 17%. In terms of the weights, we obtain for 
this point a difference of a 13% ( W£j{true) — 0.73 , and 
WjQ[{approx) — 0.70). Nevertheless, the power of our ap- 
proximation is that the largest differences always occur at 
low values of x^- The asymptotic shape of the exact dis- 
tribution function is well reproduced, as we need for a 
analysis. 

To conclude, we show in Figure in2l the third and fourth 
order moments, both for the real distribution and the ap- 
proximation, in the whole range of values for C12. By defini- 
tion, the first and second moments are equal for the true and 
the approximate distribution. We see again that the approx- 
imation follows quite closely the true function, as we have 
found from the simulations in Section 5.1. 
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Figure C2. Skewness and kurtosis for a with N = 2 terms. We see that the approximation and the true distribution coincide in the 

two limit cases of C12 = (no correlations) and C12 = 1 (totally correlated terms). For comparison, we also show these quantities for a 
gaussian distribution with the same mean and variance as the exact one (see text for details). 



C„ = 0.825 



0.8 




5 10 15 



Figure CI. Distribution function for a with correlations, with 
N = 2 terms. We have used Pi = P2 = 1, and ai = a2 = 1. We 
show the case C12 = 0.825, because for that value we have the 
largest percentage difference between the true function and our 
approximation, which is given by = 1.2 and A = 0.60. We 

can sec that the largest differences occur at low values of • The 
asymptotic values are well reproduced. 
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